Background: The purpose of this study was to determine the relative sensitivity and specificity
of 10-2 visual fields (10-2 VFs), multifocal electroretinography (mfERG), and spectral domain 
optical coherence tomography (SD-OCT) in detecting hydroxychloroquine retinopathy. 
Methods: A total of 121 patients taking hydroxychloroquine (n=119) or chloroquine (n=2) with
10-2 VF, mfERG, and SD-OCT tests were retrospectively reviewed. Rates of test abnormality 
were determined. 

Results: Retinopathy was present in 14 and absent in 107. Eleven of 14 (78.6%) patients with 
retinopathy were overdosed. Twelve (85.7%) had cumulative dosing greater than 1,000 g. The 
sensitivities of 10-2 VF, mfERG, and SD-OCT in detecting retinopathy were 85.7%, 92.9%, and 
78.6%, respectively. The specificities of 10-2 VF, mfERG, and SD-OCT in detecting retinopathy 
were 92.5%, 86.9%, and 98.1%, respectively. Positive predictive values of 10-2 VF, mfERG, and 
SD-OCT in detecting retinopathy were less than 30% for all estimates of hydroxychloroquine 
retinopathy prevalence. Negative predictive values were >99% for all tests. 
Conclusion: Based on published estimates of hydroxychloroquine retinopathy prevalence, 
all three tests are most reliable when negative, allowing confident exclusion of retinopathy in 
patients taking <6.5 mg/kg/day. Each test is less useful in allowing a confident diagnosis of 
retinopathy when positive, especially in patients taking <6.5 mg/kg/day. 
Keywords: hydroxychloroquine, chloroquine, retinopathy, multifocal electroretinography, 
spectral domain optical coherence tomography, ideal body weight, toxicity 

Introduction 

Individual physicians and national health care policymakers disagree on the need to
screen for hydroxychloroquine and chloroquine retinopathy. For hydroxychloroquine, 
the published consensus in the UK is that screening is unnecessary given the rarity 
of toxicity in patients prescribed less than 6.5 mg/kg/day based on lean body mass 
and the absence of a test proven to detect retinopathy at a reversible stage. In the 
USA, toxic dosing has been estimated to occur in at least 12% of patients prescribed 
hydroxychloroquine, and retinopathy to occur in approximately 1% of patients on 
the drug for more than 5 years. Therefore, screening for detection and correction 
of toxic dosing and for detection of retinopathy are considered cost-effective and are 
the standard of care.

In 2011, revised guidelines for screening for hydroxychloroquine
retinopathy were published by the American Academy
of Ophthalmology that placed emphasis on three ancillary
tests, ie, multifocal electroretinography (mfERG), spectral
domain optical coherence tomography (SD-OCT), and fundus
autofluorescence. It was recommended that at least one
of these tests be performed, when available, in addition to
10-2 visual field (10-2 VF) testing, which had been the standard
screening test supplementing ophthalmic examination since
the previous guidelines of 2002. In practice, the use of fundus
autofluorescence is rare, even when it is available. Therefore,
mfERG and SD-OCT are the most commonly used tests for
hydroxychloroquine screening besides 10-2 VFs.

The revised guidelines influence the behavior of thou- 
sands of ophthalmologists in the USA and in other countries 
where screening for hydroxychloroquine retinopathy is 
standard, making the performance characteristics of these 
ancillary tests important. Yet as of 2005, there were no such
data for mfERG, and few have appeared since. Fundus
autofluorescence and SD-OCT were not used as ancillary tests until
2006 and 2009, respectively. Since then, scant information
addressing their relative performance as screening tests has
been published, and what is available has been based on few
patients. It is the purpose of this article to report on a series
of patients taking hydroxychloroquine in whom the relative 
sensitivity and specificity of 10-2 VF, mfERG, and SD-OCT 
in the detection of retinopathy have been determined. 

Materials and methods 

This is a retrospective study of 121 patients screened for 
hydroxychloroquine and chloroquine retinopathy in a private 
practice having 26 ophthalmologists and three optometrists. 
The inclusion criteria were that each patient have good qual- 
ity 10-2 VF (false-positive and false-negative responses less 
than 20%), mfERG (no 60 cycle noise or eccentric fixation), 
and SD-OCT (no artifacts interfering with analysis of retinal 
layers). The list of patients was obtained by querying the
practice's electronic medical records using the International
Classification of Diseases, 9th Revision (ICD-9) code
V58.69. Data extracted from the charts included sex, age,
diagnosis, date hydroxychloroquine was started, dose and 
changes over time, ancillary tests used, height, weight, pre- 
existing macular abnormalities, and macular description. 
Patient height and weight were self-reported. Ideal body
weight was generally calculated from height by clinicians
using the National Heart Lung and Blood Institute table. For
hydroxychloroquine and chloroquine, doses >6.5 mg/kg/day
and >3.0 mg/kg/day, respectively, are referred to as
potentially toxic, not because lower doses cannot be
associated with maculopathy, but because of the acknowledged
higher risk of doses in this range. In this paper,
doses <6.5 mg/kg/day and <3.0 mg/kg/day for hydroxychloroquine
and chloroquine, respectively, are referred to as
typically nontoxic, not because doses in this range cannot be
associated with maculopathy, but because they typically are
not.18,21-23 Thresholds for cumulative dosage at which risk of
retinopathy is purported to increase vary in the literature; we
have used 1,000 g and 300 g for hydroxychloroquine and
chloroquine, respectively.

In studies in which sensitivity and specificity are to
be determined, a gold standard against which the tests will
be graded must be defined. For the purposes of this work,
the definition of hydroxychloroquine retinopathy (the gold
standard) was that the drug was discontinued by the ophthalmologist
and the prescribing physician because retinopathy
was considered to be present based on the totality
of the clinical evidence. This gold standard has been used
before.

To avoid problems of correlated results between eyes, 
only one eye was included per patient. When only one of
two eyes had good quality testing, that eye was chosen. When 
two eyes had good quality testing, a random number genera- 
tor was used to pick which of the two was included. 
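
The eye-selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `choose_study_eye`, the string labels, and the use of Python's `random` module as the random number generator are hypothetical and are not taken from the study.

```python
import random

def choose_study_eye(right_ok, left_ok, rng=random):
    """Pick one eye per patient (hypothetical helper, not from the study).

    right_ok / left_ok state whether that eye had good quality testing.
    """
    if right_ok and left_ok:
        # Both eyes qualify: let the random number generator decide.
        return rng.choice(["right", "left"])
    if right_ok:
        return "right"
    if left_ok:
        return "left"
    return None  # neither eye qualifies, so the patient is not included

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
print(choose_study_eye(True, False, rng))
print(choose_study_eye(True, True, rng))
```

Using one eye per patient keeps the observations independent, which is the stated reason for the rule.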

To calculate adjusted daily dose, the lesser of the ideal 
body weight based on height and actual body weight was 
used as the operative weight and the formula was daily dose 
(in mg/day)/[operative weight (in pounds)/(2.2 pounds/kg)]. 
This number has units of mg/kg/day. In five cases, only
the height (and hence the ideal body weight) was available, 
and in one case only the actual body weight was available. 
In these cases, the single available datum was used for the 
calculation of adjusted daily dose. 
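
The adjusted daily dose calculation can be written out as a short function. This is a minimal sketch: the name `adjusted_daily_dose` and its parameters are hypothetical, and the example inputs (400 mg/day, ideal body weight 123 pounds) are taken from the case described in the Figure 2 notes.

```python
LB_PER_KG = 2.2  # conversion factor used in the formula above

def adjusted_daily_dose(daily_dose_mg, ideal_weight_lb=None, actual_weight_lb=None):
    """Adjusted daily dose in mg/kg/day (hypothetical helper name).

    The operative weight is the lesser of ideal and actual body weight;
    if only one of them is known, that single value is used.
    """
    weights = [w for w in (ideal_weight_lb, actual_weight_lb) if w is not None]
    if not weights:
        raise ValueError("at least one weight is required")
    operative_weight_lb = min(weights)
    return daily_dose_mg / (operative_weight_lb / LB_PER_KG)

# 400 mg/day with an ideal body weight of 123 pounds
print(round(adjusted_daily_dose(400, ideal_weight_lb=123), 2))  # 7.15
```

With a 6.5 mg/kg/day threshold, a 400 mg/day dose is above the threshold for any operative weight under about 135 pounds.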

VF testing was done with the 10-2 program of the Hum- 
phrey visual field analyzer (Carl Zeiss Meditec, Dublin, CA, 
USA). Visual fields were sometimes performed with a III red 
test object using a FASTPAC protocol and in other cases 
with a III white test object using a SITA-FAST protocol. 
The results for both types were pooled. 

Multifocal electroretinograms were performed following 
International Society for Clinical Electrophysiology of Vision 
guidelines. The mfERGs were performed with the Espion
system (Diagnosys LLC, Lowell, MA) running under version 
6+ software. DTL fiber electrodes were used. The patients' 
eyes were dilated and topical anesthetic was used. The 
stimulation pattern was dictated by an m-sequence controlling
the illumination of 61 contiguous hexagons subtending
30 degrees of VF to either side of fixation. The luminances of
the white and black hexagons were 1,000 cd/m² and 0 cd/m²,
respectively. Signals were processed through a 10-100 hertz
bandpass filter. The first order kernel response was analyzed.
The waveform amplitudes refer to the voltage measured from
the trough of the N1 wave to the peak of the P1 wave. The
displays are shown in the retinal view (as though looking at 
a fundus photograph, not as though looking at a VF). 

In the interpretation of mfERGs in the context of hydroxy- 
chloroquine toxicity screening, various criteria for toxicity 
have been used, including critical values for amplitude, 
implicit time, and ring ratios. In this study, the definition 
of an abnormal mfERG was that one or more of the following 
was true: Rj, R^, or R^ amplitudes less than the lower limit 
of normal determined in 32 normal volunteers tested on the 
same system or an R1/R2 ratio >2.6.
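
The abnormality rule above is an "any criterion met" rule, which can be sketched as follows. Everything named here is an assumption made for illustration: the function `mferg_is_abnormal`, the dictionary layout, the rings passed in `rings_checked`, and the lower-limit values in the example are placeholders, not the study's normative data. Only the 2.6 ratio threshold comes from the text.

```python
def mferg_is_abnormal(amplitudes, lower_limits, rings_checked, ratio_threshold=2.6):
    """Return True if any amplitude criterion or the ring ratio criterion is met.

    amplitudes, lower_limits: dicts keyed by ring name, e.g. "R1".
    rings_checked: the rings whose amplitudes are compared with the
    lower limit of normal (placeholder choice, supplied by the caller).
    """
    low_amplitude = any(amplitudes[r] < lower_limits[r] for r in rings_checked)
    high_ratio = amplitudes["R1"] / amplitudes["R2"] > ratio_threshold
    return low_amplitude or high_ratio

# Placeholder lower limits of normal (hypothetical values)
limits = {"R2": 9.0, "R3": 5.5}
patient = {"R1": 30.0, "R2": 10.0, "R3": 6.0}
print(mferg_is_abnormal(patient, limits, ["R2", "R3"]))
```

In this example the amplitudes are above the placeholder limits, but the R1/R2 ratio of 3.0 exceeds 2.6, so the test is called abnormal.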

SD-OCTs were obtained with either the Cirrus (Carl 
Zeiss Meditec) or Spectralis (Heidelberg Engineering, 
Carlsbad, CA, USA) systems. In interpreting the SD-OCT
in the context of screening for hydroxychloroquine toxicity, 
various criteria have been used including perifoveal loss of 
the ellipsoid zone, loss of the retinal pigment epithelial layer, 
and generalized macular thinning. In this work, loss of the
perifoveal ellipsoid zone and discontinuity of the retinal 
pigment epithelial layer were used as the definition of an 
abnormal SD-OCT test. Because only morphologic data were 
used in the interpretation of the SD-OCTs, the results were 
pooled from both machines. 

Sensitivity, specificity, positive predictive value, and 
negative predictive value have standard definitions. Sensi- 
tivity and specificity can be determined from a case-control 
study design as was employed here, but positive and negative 
predictive values depend on prevalence, which can only be 
determined from a prospective population-based sample. No
such study exists; therefore, prevalence estimates that have
been published in the literature have been used instead in the
calculation of positive and negative predictive values. This
is consistent with other studies in this field in which positive
predictive values and negative predictive values were
calculated based on estimated prevalences in the absence of
reliable population-based data. For ease of reference, the
definitions follow:

Sensitivity = (number of true positives)/(number of true 
positives + number of false negatives) 

Specificity = (number of true negatives)/(number of true 
negatives + number of false positives) 



Positive predictive value = (sensitivity)(prevalence)/
[(sensitivity)(prevalence) + (1 - specificity)(1 - prevalence)]

Negative predictive value = (specificity)(1 - prevalence)/
[(specificity)(1 - prevalence) + (1 - sensitivity)(prevalence)]
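
The four definitions can be written as small functions to show how predictive values shift with assumed prevalence. This is a sketch, not the software used in the study; the function names are hypothetical. The example inputs are the sensitivities and specificities reported in the Results and the 1% prevalence estimate.

```python
def sensitivity(true_pos, false_neg):
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    return true_neg / (true_neg + false_pos)

def positive_predictive_value(sens, spec, prevalence):
    true_pos_rate = sens * prevalence
    false_pos_rate = (1 - spec) * (1 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)

def negative_predictive_value(sens, spec, prevalence):
    true_neg_rate = spec * (1 - prevalence)
    false_neg_rate = (1 - sens) * prevalence
    return true_neg_rate / (true_neg_rate + false_neg_rate)

# Reported (sensitivity, specificity) pairs, evaluated at 1% prevalence
tests = {"10-2 VF": (0.857, 0.925), "mfERG": (0.929, 0.869), "SD-OCT": (0.786, 0.981)}
for name, (sens, spec) in tests.items():
    ppv = positive_predictive_value(sens, spec, 0.01)
    npv = negative_predictive_value(sens, spec, 0.01)
    print(f"{name}: PPV {100 * ppv:.0f}%, NPV {100 * npv:.0f}%")
```

Because the false-positive term is weighted by (1 - prevalence), even a small loss of specificity dominates the positive predictive value when the disease is rare.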

Age, weight, and height were non-normally distributed.
Therefore, the descriptive statistics presented are nonparametric.
Statistical comparisons were done with the
Kruskal-Wallis test or Fisher's exact test; all tests were two-tailed.
JMP 4.0 software (SAS Institute Inc, Cary, NC, USA)
was used for statistical calculations and testing. P-values
are uncorrected for multiple hypothesis testing. Waiver of
informed consent and waiver of Health Insurance Portability 
and Accountability Act authorization were approved by the 
Presbyterian Hospital institutional review board (number 
12053). This study was conducted in accordance with the 
Declaration of Helsinki. 
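
For readers who want to check a two-tailed Fisher's exact test by hand, the calculation can be sketched in pure Python. This is an independent illustration, not the statistical package used in the study; the function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention that is assumed rather than confirmed by the text. The example table is the comparison of patients taking >6.5 mg/kg/day reported in the Results (9 of 12 with retinopathy, 17 of 103 without).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one.
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)

# Rows: retinopathy / no retinopathy; columns: >6.5 mg/kg/day / not
p = fisher_exact_two_sided(9, 3, 17, 86)
print(f"P = {p:.6f}")
```

The result for this table falls below 0.0001, in line with the P-value reported for that comparison.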

Results 

Of the 121 patients, 110 (91%) were female. One hundred 
and nineteen patients were screened for hydroxychloroquine 
retinopathy. Two patients had been taking chloroquine. 
Because the pathophysiology, screening methodologies, 
and clinical issues regarding toxicity are analogous for both 
drugs, the patients taking chloroquine were included in the 
analysis. The median age was 63 years (range 21-90; inter- 
quartile range [IQR] 50, 72). 

Height was determined in 120 patients (99%). The median 
height was 64 inches (range 59-74; IQR 63, 66). Nineteen
percent were 5 feet 3 inches tall or less, the height at or below
which the corresponding ideal body weight makes a dosage of
400 mg/day of hydroxychloroquine toxic.

Weight was determined in 116 patients (96%). The 
median weight was 163 pounds (range 100-290; IQR 135,
185). Twenty-two percent weighed 135 pounds or less, a lean
body mass threshold below which the hydroxychloroquine
dosage of 400 mg/day becomes toxic.

The daily dose of hydroxychloroquine was determined 
in 114 patients (94%). Sixty-eight (59.6%), 35 (30.7%), 
and 11 (9.6%) were taking 400 mg/day, 200 mg/day, 
and 300-399 mg/day, respectively. The patients taking 
300-399 mg/day were taking 400 mg/day for most days 
of the week and skipping the other days' doses to give the 
stated average daily dose. Cumulative dosages could be 
determined in 96 patients (79%). The median dose was 984 g
(range 17-3,942; IQR 426, 1,460). Fifty-six (49.4%) patients
had taken a cumulative dose of ≥1,000 g, a cumulative dose
threshold cited as increasing the risk of retinopathy.

Pre-existing maculopathy was present in 27 patients. 
Twelve had mild age-related retinal pigment epithelial mot- 
tling, 12 had drusen, one had a few hemorrhages from a 
remote macular branch retinal vein occlusion, one had a few 
microaneurysms attributed to mild nonproliferative diabetic 
retinopathy, and one had previously undergone vitrectomy 
for a macular epiretinal membrane. 

Of the 121 patients in the study, 107 (88%) did not have 
hydroxychloroquine retinopathy, were taking the drug, were 
under active monitoring for toxicity, and were given clear- 
ance to continue taking the drug. Fourteen patients (12%; 
12 taking hydroxychloroquine and two taking chloroquine) 
were deemed to have toxic retinopathy and were taken off 
their medication. All 14 patients with retinopathy were 
female. Adjusted daily dosages, cumulative dosages, and 
results of testing are shown in Table 1. Eleven of 14 (78.6%)
were overdosed. Twelve (85.7%) had cumulative dosing 
above thresholds for increased risk. 

Renal insufficiency was present in one patient (0.8%) and 
another had had surgery for renal stones, but was not known 
to have insufficiency; both had hydroxychloroquine retinopa- 
thy. Liver disease was present in one patient (0.8%), who did 
not have hydroxychloroquine retinopathy. Pre-existing macu- 
lopathy was present in 2/14 (14.3%) and 25/107 (23.4%) 
patients with and without hydroxychloroquine retinopathy, 
respectively (P=0.7332, Fisher's exact test). The median
ages were 61 years (IQR 46, 68) and 63 years (IQR 51, 72) 
for those with and without hydroxychloroquine retinopathy, 
respectively (P=0.6942, Kruskal-Wallis test). The proportion 
of patients aged 60 years or older did not differ between those 
with or without retinopathy (data not shown). 

There were 115 patients taking hydroxychloroquine 
with data on adjusted daily dose (12 with retinopathy and 
103 without retinopathy). The median adjusted daily doses 
were 6.7 mg/kg/day (IQR 6.2, 7.1) and 5.6 mg/kg/day (IQR 
3.6, 6.1) for the patients with and without retinopathy, 
respectively (P=0.0065, Kruskal-Wallis test). The proportion 
of these patients taking >6.5 mg/kg/day was 9/12 (75.0%) 
and 17/103 (16.5%) for those with and without retinopathy, 
respectively (P<0.0001, Fisher's exact test). The cumulative 
doses of hydroxychloroquine were known for 96 patients 
(12 with retinopathy and 84 without retinopathy). The median 
cumulative doses were 1,635 g (IQR 1,040, 2,920) and 859 g
(IQR 365, 1,460) for patients with and without retinopathy,
respectively (P=0.0344, Kruskal-Wallis test). The proportion
of patients with cumulative doses of hydroxychloroquine
1,000 g or more was 10/12 (83.3%) and 37/84 (44.0%) for
those with and without retinopathy, respectively (P=0.0135,
Fisher's exact test).

Table 2 Multifocal electroretinogram indices by retinopathy

mfERG variable       No retinopathy      Retinopathy         Ratio of   P-value
                     group (nV/mm²)      group (nV/mm²)      medians
R1 N1-P1 amplitude   26.1 (20.9, 31.1)   16.7 (9.7, 22.7)    1.56       0.0126
R2 N1-P1 amplitude   14.1 (11.8, 16.7)   7.8 (3.9, 9.6)      1.81       0.0014
R3 N1-P1 amplitude   8.6 (6.8, 10.0)     5.2 (3.0, 6.2)      1.65       <0.0001
R4 N1-P1 amplitude   6.7 (5.0, 7.6)      4.5 (3.1, 6.3)      1.49       <0.0001
R5 N1-P1 amplitude   5.6 (4.2, 6.8)      3.8 (2.6, 5.6)      1.47       0.0041
R1/R2 ratio          1.83 (1.64, 2.01)   2.11 (1.83, 2.93)   1.15       0.0130

Notes: Ratio of medians, median of the no retinopathy group/median of the retinopathy group. The entries in columns two and three are medians with interquartile ranges.
Abbreviations: nV, nanovolts; mfERG, multifocal electroretinography.


The mfERG ring averaged amplitudes and the R1/R2 ratio
were different between the patients with and without hydroxy- 
chloroquine retinopathy (Table 2). The differences with the 
greatest separation were for rings R3 and R4 followed by
R2 and R5. The separation was least for R1 and R1/R2 (Table 2).
Of the mfERG variables, a ring amplitude below the lower limit
of normal (the amplitude criterion given in the Methods) was
more sensitive (13/14, 92.9%) than the R1/R2 ratio (5/14,
35.7%) in detecting retinopathy (Table 2).

The sensitivity and specificity of the three ancillary tests
are shown in Table 3. The order of sensitivity of the tests
in detecting hydroxychloroquine retinopathy was mfERG
(92.9%) > 10-2 VF (85.7%) > SD-OCT (78.6%). The order
of specificity was reversed, ie, SD-OCT (98.1%) > 10-2 VF
(92.5%) > mfERG (86.9%). The combinations of 10-2 VF
and mfERG or SD-OCT and mfERG were more sensitive
(100%) than either test alone. Table 4 shows the positive and
negative predictive values of the three tests for the range of
prevalences that have been reported. All three tests
share the trait of having a high negative predictive value.
The 10-2 VF and mfERG have a low positive predictive
value for the most probable prevalences (0.1% and 1%). The
SD-OCT is distinguished by its higher positive predictive
value at these more probable prevalences compared with
the other tests.

Table 3 Sensitivity and specificity of ancillary tests for hydroxychloroquine retinopathy

Ancillary test      Sensitivity (%)   Specificity (%)
SD-OCT              78.6              98.1
10-2 VF             85.7              92.5
mfERG               92.9              86.9
10-2 VF + mfERG     100               82.2
10-2 VF + SD-OCT    85.7              92.5
mfERG + SD-OCT      100               86.0

Abbreviations: VF, visual field; mfERG, multifocal electroretinogram; SD-OCT,
spectral domain optical coherence tomography.
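
The pattern in Table 3, in which combined tests gain sensitivity and lose specificity, is what one would expect if a combination is called positive when either component test is positive. The text does not state how combinations were scored, so the "either positive" rule in this sketch is an assumption, and the per-patient results below are invented toy data, not study data.

```python
def combine_any(*test_results):
    """Per-patient OR across tests: positive if any single test is positive."""
    return [any(flags) for flags in zip(*test_results)]

def sensitivity_specificity(test_positive, has_disease):
    pairs = list(zip(test_positive, has_disease))
    tp = sum(1 for t, d in pairs if t and d)
    fn = sum(1 for t, d in pairs if not t and d)
    tn = sum(1 for t, d in pairs if not t and not d)
    fp = sum(1 for t, d in pairs if t and not d)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data: 4 patients with retinopathy followed by 6 without (hypothetical)
has_disease = [True] * 4 + [False] * 6
vf    = [True, True, True, False, False, False, False, False, False, True]
mferg = [True, True, False, True, False, False, False, False, True, False]

for name, result in [("VF", vf), ("mfERG", mferg), ("VF + mfERG", combine_any(vf, mferg))]:
    sens, spec = sensitivity_specificity(result, has_disease)
    print(f"{name}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Under this rule, each test catches the case the other misses, so sensitivity rises, while the false positives of both tests accumulate, so specificity falls.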

Examples of consistent and discrepant performance 
of the three screening tests are shown in Figures 1-3. 
Figure 1 shows a patient with advanced hydroxychloroquine 
retinopathy in whom all three tests were abnormal and con- 
sistent in detecting retinopathy. Figure 2 shows a patient in 
whom the mfERG detected retinopathy, but the 10-2 VF and
SD-OCT were normal (false negatives). Figure 3 shows a 
patient in whom the 10-2 VF and SD-OCT detected retin- 
opathy, but the mfERG was normal (false negative). 

Discussion 

The sample of patients taking hydroxychloroquine in this 
series was comparable with others reported in the literature. 
Ninety-one percent of the patients were female, similar to 
the 77%-94% reported previously. The patients in
this series who developed hydroxychloroquine retinopa- 
thy were all female, also consistent with the heavy female 
preponderance noted previously. Adjusted daily dose 
and cumulative dose were both associated with retinopathy. 
The association was stronger for adjusted daily dose. Renal 
disease was a factor in one patient with retinopathy. This 
patient was taking a toxic dose for a person with normal 
renal function; the toxicity may have been exacerbated by 
her renal dysfunction. Age and pre-existing maculopathy 
were not associated with retinopathy. 

Multifocal electroretinography was first shown to be
abnormal in advanced hydroxychloroquine retinopathy in
1999. In a relatively small number of cases, it was shown
subsequently to be able to detect cases of retinopathy before
visual acuity, color vision testing, Amsler grid testing,
Goldmann perimetry, and full field electroretinographic
testing showed abnormalities. It was suggested based
on small case series, but not demonstrated in a sample of
sufficient size, that it was more sensitive than 10-2 VF
testing. An additional advantage of mfERG according to
its proponents is that it is more objective than 10-2 VF testing,
inasmuch as it does not depend on the patient's response of
pushing a button, which can be influenced by factors other
than seeing the stimulus.

Table 4 Positive and negative predictive values of ancillary tests for hydroxychloroquine retinopathy across a range of estimated
prevalences

Estimated prevalence of          Test      Positive predictive   Negative predictive
hydroxychloroquine retinopathy             value (%)             value (%)
0.1%                             10-2 VF   1                     100
                                 mfERG     1                     100
                                 SD-OCT    4                     100
1%                               10-2 VF   10                    100
                                 mfERG     7                     100
                                 SD-OCT    29                    100
3%                               10-2 VF   26                    100
                                 mfERG     18                    100
                                 SD-OCT    56                    99
5%                               10-2 VF   38                    99
                                 mfERG     27                    100
                                 SD-OCT    69                    99

Notes: Positive predictive value is the probability of having hydroxychloroquine retinopathy given a positive test under the assumed prevalence of hydroxychloroquine
retinopathy in the sample to which the patient belongs. Negative predictive value is the probability of not having hydroxychloroquine retinopathy given a normal test under
the assumed prevalence of hydroxychloroquine retinopathy in the sample to which the patient belongs.

Abbreviations: VF, visual field; SD-OCT, spectral domain optical coherence tomography; mfERG, multifocal electroretinography.

What has not been emphasized, however, is that sub- 
jectivity is not absent from mfERG testing, but has only 
been shifted from acquisition of the data to its interpreta- 
tion. That is, there is no consensus on the definition of an 
abnormal multifocal electroretinogram. Some clinicians 
subjectively compare the waveforms of the patient to a single 
normal control in a nonstatistical visual comparison, others
compare waveform amplitudes for each hexagon to control
values, others compare amplitudes of waveforms averaged
over multiple hexagons arranged in rings, others use
ring ratios, and others analyze the color difference plot in
which patient data are compared with unpublished normal
data provided by the machine manufacturer, the details of
which are unknown. Some examine amplitudes only, but
others compare waveform latencies. Some use internal
normal controls for comparisons. Others use population
norms supplied by instrument manufacturers or taken from
published literature. Some use patterns of mfERG change,
but their classifications of patterns differ. In some cases
it appears that a rough gestalt is obtained after viewing the
hexagonal waveforms, without any defined criteria being
applied.

The sensitivity and specificity of mfERG testing will
depend on the definition chosen for mfERG abnormality. For
example, Lyons and Severns state "In order to maximize the
specificity of the testing, the 99th percentile one-tailed limits
are used". They also excluded the peripheral loss pattern
from their calculations of the prevalence of toxicity, because it
was not seen in patients taking >1,125 g cumulative dose. As
another example, Xiaoyun et al defined an abnormal mfERG 
as having either a low N1 amplitude or a low P1 amplitude for
the Rj or R^ average waveform and found that 70% of patients 
with rheumatoid arthritis taking chloroquine had an abnormal 
mfERG. Had they required in addition an R1/R2 ratio >2.6,
the percentage would probably have been lower. With certain
low threshold definitions of mfERG abnormality, 20%-70%
of patients taking hydroxychloroquine and chloroquine
develop mfERG abnormalities.

There is little published evidence for the relative sen- 
sitivity and specificity of mfERG compared with 10-2 VF. 
Maturi et al reported a series of 15 patients (30 eyes, some
with and others without evidence of toxicity by the different
testing modalities), in whom both mfERG and Humphrey
30-2 perimetry were available. Even though the
30-2 program is an inferior method of screening compared with
the 10-2 program, there was no clear evidence of greater
sensitivity of one test compared with the other. Normal
mfERG responses and abnormal 30-2 VFs were found in 
eleven eyes compared with abnormal mfERG responses and 
normal 30-2 VFs in four eyes. The correlation between eyes 
of the same patient was not determined in this study. There is 
even less evidence for the relative sensitivity and specificity 
of SD-OCT compared with 10-2 VF or mfERG. 

When one ancillary test is used as the gold standard
and another is compared with it, the reasoning behind the
definition of sensitivity and specificity of a test becomes
circular. Therefore, in this paper, as in others, the gold
standard was chosen to be the physician's action of stopping
hydroxychloroquine based on the totality of the evidence.
With this gold standard, it becomes possible to compare
with less bias the relative sensitivity and specificity of the
three ancillary tests most commonly used in screening
for hydroxychloroquine retinopathy.

Figure 2 Hydroxychloroquine retinopathy detected by multifocal electroretinography, but not by 10-2 visual field testing or spectral domain optical coherence
tomography.

Notes: This 71-year-old woman with dermatologic lupus and arthritis had taken 400 mg/day of hydroxychloroquine for 37 years. She was 5 feet tall with an ideal body
weight of 123 pounds. Her adjusted daily dose was 7.15 mg/kg/day. Her cumulative dose was 2,920 g. Funduscopy was normal. All images are of the left eye (right eye similar).
(A) The multifocal electroretinogram showed N1-P1 amplitudes that were below the lower limit of normal for the central hexagon (R1) and for the rings in the dashed circled
area. The waveforms in the individual hexagons are flat (black, solid circled area). The averaged waveforms are so small that the machine-placed cursors are erroneously
positioned (black arrows). (B) The 10-2 visual fields from December 4, 2009 and December 16, 2011 are normal. There are some elevated thresholds at isolated points in the
field, but they are not reproducible. (C) Spectral domain optical coherence tomography shows an intact ellipsoid zone line throughout the scan.

Abbreviations: 3D, three dimensional; RMS, root mean square; PSD, pattern standard deviation; MD, mean deviation; FN, false negatives; FP, false positives; T, temporal;
N, nasal; LE, left eye.

Although none of the 10-2 VF, mfERG, or SD-OCT
tests has clearly superior performance characteristics as a
standalone screening test, there are differences in the reproducibility
of the three tests. Retinal thickness measurements
with optical coherence tomography are reproducible, with
SD-OCT coefficients of variation for macular thickness
of <3.5%. In contrast, neither 10-2 VF testing nor
mfERG is highly reproducible. The coefficient of variation
for mfERG amplitudes ranges from 10% to 35%. There
are no reproducibility data published for 10-2 VF testing,
but the clinical reality of variability and difficulty of interpretation
by clinicians is widely cited. This makes
the SD-OCT of particular worth relative to the other two
ancillary tests. By combining the three tests, it was possible
to achieve 100% sensitivity for detection of hydroxychloroquine
retinopathy. One hundred percent sensitivity was
achieved for the combinations 10-2 VF + mfERG and SD-OCT
+ mfERG, but not for 10-2 VF + SD-OCT, which had
a sensitivity of 89%.

Reliable data on prevalence of hydroxychloroquine 
retinopathy among users would need to come from a 
population-based study with standard examination techniques 
and definitions of retinopathy. Such a study has not been done, 
nor is it likely to be done in the future. Therefore, published
prevalence estimates provide a crude range within which the
true prevalence probably lies. However, regardless of the
estimate used, the positive predictive values of 10-2 VFs, 
mfERGs, and SD-OCTs are low and the negative predic- 
tive values are high. The clinical message is that these tests 
rarely misclassify a patient who truly has hydroxychloroquine 
retinopathy as healthy, but they are subject to misclassifying 
healthy persons as having hydroxychloroquine retinopathy. 
The appropriate clinical response is to make sure that the 
patient's sole modifiable risk factor, the adjusted daily dose, 
lies in a range of higher safety, to use more than one test to 
assess suspicious cases, and to evaluate the patient longitudi- 
nally with shortened follow-up intervals in suspicious cases. 
If the suspicion is insufficient to stop the drug, but too
high to be addressed by intensified monitoring alone, the daily
dose can be reduced. Several have advocated a lower adjusted
daily dose threshold of 6.0 mg/kg/day in cases in which risk
reduction is important. The risk of retinopathy decreases
as dosing is lowered.

The present study has limitations. There were only 
14 cases of hydroxychloroquine and chloroquine retinopathy 
for which all three tests were available. We can assess the 
marginal error of our sensitivity statistic if we assume that a 
test with a calculated sensitivity of 92.9% such as the mfERG 
performed poorly on the next case of retinopathy. In this case, 
the calculated sensitivity of the test would fall to 86.7%, an 
undesirably large marginal error of 6.2%. This is not the case 
for the specificity, for which an analogous calculation in the 
instance of the 10-2 VF shows a marginal error of 0.8%. 
It would be helpful to have studies with larger samples of 
hydroxychloroquine retinopathy for more robust estimation 
of test sensitivities. Nevertheless, this study provides the 
largest series in the literature attempting to estimate the 
sensitivities and specificities of these commonly used tests. 
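
The marginal error argument above is simple arithmetic and can be reproduced directly. In this sketch the function names are hypothetical, percentages are rounded to one decimal place before subtracting (an assumption made to mirror the figures quoted in the text), and the count of 99 correctly classified patients without retinopathy for the 10-2 VF is back-calculated from the reported 92.5% specificity in 107 patients rather than stated in the source.

```python
def rate_pct(correct, total):
    """Percentage correctly classified, rounded to one decimal place."""
    return round(100 * correct / total, 1)

def marginal_error_pct(correct, total):
    """Drop in the rounded percentage if one further case is misclassified."""
    return round(rate_pct(correct, total) - rate_pct(correct, total + 1), 1)

# mfERG sensitivity: 13 of 14 detected, then a hypothetical 15th case is missed
print(rate_pct(13, 14), rate_pct(13, 15), marginal_error_pct(13, 14))

# 10-2 VF specificity: 99 of 107 (back-calculated), then one more false positive
print(rate_pct(99, 107), rate_pct(99, 108), marginal_error_pct(99, 107))
```

The contrast comes entirely from the denominators: one additional case shifts an estimate based on 14 patients far more than one based on 107.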